Add launch_custom subprocess dispatcher for CustomResourceConfig #626

Queued
kmontemayor2-sc wants to merge 13 commits into main from kmonte/custom-resource-config-pr2-launcher

@kmontemayor2-sc
Collaborator

Implements `launch_custom`, a thin shim that takes a populated
`CustomResourceConfig` and shells out via
`subprocess.run(shell=True, check=True)`. The proto's `command` is
a shell snippet (so leading `KEY=VALUE` env assignments parse
naturally), and each entry of `args[]` is quoted with `shlex.quote`
before joining, so values containing whitespace survive the shell pass.
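
A minimal sketch of the quoting and dispatch described above, not the actual module. `CustomConfigStandIn`, `build_command_line`, and `launch` are hypothetical names used only for illustration; the real function takes the generated proto message rather than a dataclass.

```python
import shlex
import subprocess
from dataclasses import dataclass, field


@dataclass
class CustomConfigStandIn:
    """Hypothetical stand-in for the proto: a shell snippet plus positional args."""

    command: str
    args: list[str] = field(default_factory=list)


def build_command_line(config: CustomConfigStandIn) -> str:
    # `command` is left unquoted so a leading KEY=VALUE assignment is
    # interpreted by the shell; each arg is quoted individually so that
    # whitespace and shell metacharacters inside it stay literal.
    quoted_args = [shlex.quote(arg) for arg in config.args]
    return " ".join([config.command, *quoted_args])


def launch(config: CustomConfigStandIn) -> None:
    # check=True raises subprocess.CalledProcessError on a non-zero exit code.
    subprocess.run(build_command_line(config), shell=True, check=True)
```

Because only `args[]` is quoted, anything in `command` is interpreted by the shell, so `command` has to come from a trusted config.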

The dispatcher performs no template substitution: `command` and
`args[]` are taken verbatim, and any placeholder text reaches
`subprocess.run` literally. Consumers that want runtime-context
substitution (e.g. `${gigl:foo}`) should resolve it at YAML-load
time, before the proto reaches this module. No call site in the
rest of the repo invokes `launch_custom` yet; wiring is added in
a follow-up PR.
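
One way a consumer might resolve placeholders before building the proto, sketched under assumptions: the `${gigl:name}` grammar, the `resolve_placeholders` and `resolve_config` helpers, and the plain-dict config shape are all hypothetical and not part of this PR.

```python
import re

# Matches ${gigl:name}; the exact placeholder grammar is an assumption.
_PLACEHOLDER = re.compile(r"\$\{gigl:([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_placeholders(text: str, context: dict[str, str]) -> str:
    """Replace each ${gigl:name} in `text` with context[name]."""

    def _substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in context:
            # Fail at load time rather than passing the literal placeholder on.
            raise KeyError(f"no value for placeholder {match.group(0)}")
        return context[name]

    return _PLACEHOLDER.sub(_substitute, text)


def resolve_config(raw: dict, context: dict[str, str]) -> dict:
    """Resolve placeholders in a loaded config dict with `command` and `args` keys."""
    return {
        "command": resolve_placeholders(raw["command"], context),
        "args": [resolve_placeholders(arg, context) for arg in raw.get("args", [])],
    }
```

Resolved values need no extra escaping when they land in `args[]`, since the dispatcher quotes each arg afterwards; values substituted into `command` get no such protection.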

Introduces a new `oneof` arm on `TrainerResourceConfig` /
`InferencerResourceConfig` that lets callers describe a launcher as a
shell command + positional args, instead of a fixed-shape Vertex AI /
KFP / local resource config. The proto carries no semantics here — the
dispatcher is added in a follow-up PR; this commit only ships the
message, regenerated bindings, and the wrapper-property update so
downstream code can read `wrapper.trainer_config` and get a
`CustomResourceConfig` back.

The diff includes a long tail of cosmetic Scala changes outside
`gigl_resource_config/` because scalapbc regenerates every sibling
proto's emitted source whenever any one proto in the same directory
changes. Reviewers can scope to `CustomResourceConfig.scala` and the
`*ResourceConfig.scala` siblings that gain the new oneof case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kmontemayor2-sc
Collaborator Author

/all_test

@github-actions
Contributor

github-actions Bot commented May 6, 2026

GiGL Automation

@ 23:08:16UTC : 🔄 Lint Test started.

@ 23:12:40UTC : ❌ Workflow failed.
Please check the logs for more details.

@github-actions
Contributor

github-actions Bot commented May 6, 2026

GiGL Automation

@ 23:08:18UTC : 🔄 Python Unit Test started.

@ 23:14:32UTC : ❌ Workflow failed.
Please check the logs for more details.

@github-actions
Contributor

github-actions Bot commented May 6, 2026

GiGL Automation

@ 23:08:18UTC : 🔄 Scala Unit Test started.

@ 23:18:08UTC : ✅ Workflow completed successfully.

@github-actions
Contributor

github-actions Bot commented May 6, 2026

GiGL Automation

@ 23:08:22UTC : 🔄 C++ Unit Test started.

@ 23:10:11UTC : ✅ Workflow completed successfully.

@github-actions
Contributor

github-actions Bot commented May 6, 2026

GiGL Automation

@ 23:08:24UTC : 🔄 E2E Test started.

@ 24:38:24UTC : ✅ Workflow completed successfully.

@github-actions
Contributor

github-actions Bot commented May 6, 2026

GiGL Automation

@ 23:08:25UTC : 🔄 Integration Test started.

@ 24:24:39UTC : ✅ Workflow completed successfully.

Comment thread gigl/src/common/custom_launcher.py
kmontemayor2-sc force-pushed the kmonte/custom-resource-config-pr2-launcher branch from 357c3cc to 2ef62bd on May 8, 2026 00:16
CustomResourceConfig is launcher-pluggable — there is no concrete
machine spec to validate against (no machine_type, num_workers, GPU
config, etc.). The wrapper's trainer_config / inferencer_config
properties now return a union that includes CustomResourceConfig
(introduced earlier in this PR), but _validate_machine_config does
not accept it.

Add an isinstance early-return guard at each call site that logs
the skip and returns. Shape and backend-compatibility validation
for CustomResourceConfig come in a follow-up PR; this commit only
makes the existing validation flow type-clean against the widened
union.
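
The guard could look roughly like the sketch below. The two dataclasses and `check_trainer_config` are hypothetical stand-ins for the generated proto types and the real call sites, and the body of `_validate_machine_config` here is illustrative only.

```python
import logging
from dataclasses import dataclass
from typing import Union

logger = logging.getLogger(__name__)


@dataclass
class MachineConfigStandIn:
    """Hypothetical stand-in for a config with a concrete machine spec."""

    machine_type: str
    num_workers: int


@dataclass
class CustomConfigStandIn:
    """Hypothetical stand-in for the custom config: no machine spec at all."""

    command: str


def _validate_machine_config(config: MachineConfigStandIn) -> None:
    # Illustrative checks only; the real validator's rules are not shown in this PR text.
    if not config.machine_type:
        raise ValueError("machine_type must be set")
    if config.num_workers < 1:
        raise ValueError("num_workers must be at least 1")


def check_trainer_config(config: Union[MachineConfigStandIn, CustomConfigStandIn]) -> None:
    # Early return: there is no machine spec to validate on the custom arm,
    # and the isinstance check narrows the union for the type checker.
    if isinstance(config, CustomConfigStandIn):
        logger.info("Skipping machine-config validation for custom config")
        return
    _validate_machine_config(config)
```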

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kmontemayor2-sc force-pushed the kmonte/custom-resource-config-pr2-launcher branch from 2ef62bd to 837dbc5 on May 8, 2026 00:20
Implements `launch_custom`, a thin shim that takes a populated
`CustomResourceConfig` and shells out via
`subprocess.run(shell=True, check=True)`. The proto's `command` is
a shell snippet (so leading `KEY=VALUE` env assignments parse
naturally) and `args[]` are individually `shlex.quote`-d before
joining, so values containing whitespace survive the shell pass.

The dispatcher performs no template substitution: `command` and
`args[]` are taken verbatim, and any placeholder text reaches
`subprocess.run` literally. Consumers that want runtime-context
substitution (e.g. ${gigl:foo}) should resolve it at YAML-load
time before the proto reaches this module. No call site in the
rest of the repo invokes `launch_custom` yet — wiring is added in
a follow-up PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kmontemayor2-sc force-pushed the kmonte/custom-resource-config-pr2-launcher branch from 837dbc5 to 066c473 on May 8, 2026 00:21
kmontemayor2-sc and others added 7 commits May 11, 2026 11:58
…b.com/Snapchat/GiGL into kmonte/custom-resource-config-pr1-proto
The proto message describes a launcher command (command + args), not a
resource shape (machines / replicas / pools) like its sibling oneof
arms (VertexAiResourceConfig, KFPResourceConfig, LocalResourceConfig).
Aligning the name with the supporting Python (custom_launcher.py /
launch_custom) removes the misleading "Resource" suffix. Field names
on TrainerResourceConfig / InferencerResourceConfig stay
(custom_trainer_config / custom_inferencer_config) so the sibling
field-name pattern is preserved.

Regenerated pb2 / pyi / Scala bindings. Wrapper + validator + test
references updated.
Follow-up to the proto rename. Updates the launcher module's import,
parameter type annotation, docstring, and error message, plus its
test file. Also renames the launcher parameter from
`custom_resource_config` to `custom_launcher_config` to match the
new proto type name.
Comment thread tests/unit/src/common/custom_launcher_test.py Outdated
Comment thread tests/unit/src/common/custom_launcher_test.py
Collaborator

@mkolodner-sc mkolodner-sc left a comment

Stamp

kmontemayor2-sc marked this pull request as ready for review May 14, 2026 14:24
kmontemayor2-sc enabled auto-merge May 14, 2026 14:24
kmontemayor2-sc added this pull request to the merge queue May 14, 2026
Any commits made after this event will not be merged.

4 participants